Comments for MEDB 5501, Week 6
Assessing normality
Problems caused by non-normality
Poor confidence intervals, hypothesis tests
Too much imprecision
Poor coverage probability
Especially for one-tailed tests
Inability to extrapolate
What about the Central Limit Theorem?
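The coverage problem can be checked with a small simulation. The sketch below is an illustration under stated assumptions, not part of the original slides: the data are exponential (right skewed) with a true mean of 1, the sample size is 10, and the t critical values for 9 degrees of freedom are hardcoded from a standard t table. The one-sided bounds stand in for one-tailed tests.

```python
import math
import random
import statistics

# Critical values of the t distribution with 9 degrees of freedom (n = 10),
# hardcoded from a standard t table.
T_975 = 2.262  # two-sided 95% interval
T_95 = 1.833   # one-sided 95% bound


def coverage(n_reps=10000, n=10, seed=1):
    """Estimate how often nominal 95% t-based intervals contain the true mean."""
    rng = random.Random(seed)
    true_mean = 1.0  # mean of an exponential distribution with rate 1
    two_sided = upper_bound = lower_bound = 0
    for _ in range(n_reps):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        xbar = statistics.fmean(sample)
        se = statistics.stdev(sample) / math.sqrt(n)
        if xbar - T_975 * se <= true_mean <= xbar + T_975 * se:
            two_sided += 1
        if true_mean <= xbar + T_95 * se:   # one-sided upper confidence bound
            upper_bound += 1
        if true_mean >= xbar - T_95 * se:   # one-sided lower confidence bound
            lower_bound += 1
    return two_sided / n_reps, upper_bound / n_reps, lower_bound / n_reps


two_sided, upper_bound, lower_bound = coverage()
print(f"Two-sided interval coverage: {two_sided:.3f} (nominal 0.95)")
print(f"Upper bound coverage:        {upper_bound:.3f} (nominal 0.95)")
print(f"Lower bound coverage:        {lower_bound:.3f} (nominal 0.95)")
```

With right skewed data, the two one-sided bounds should miss the nominal level in opposite directions, which is why the problem is worse for one-tailed procedures than for two-tailed ones. Increasing `n` in the call to `coverage` is one way to see how much the Central Limit Theorem helps.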
How to handle non-normality
Ignore it
Central Limit Theorem
Transform your data
Use alternatives
Nonparametric tests (covered in a later module)
Bootstrap (covered in a later module)
Randomization tests (not covered in this class)
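As an illustration of the "transform your data" option, the sketch below applies a log transformation to simulated right skewed data. The lognormal data, the sample size, and the `skewness` helper (the average cubed z-score, one common numerical summary of skewness) are assumptions made for this example and do not come from the slides.

```python
import math
import random
import statistics


def skewness(values):
    """Average cubed z-score: near 0 for symmetric data, positive for right skew."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean([((v - mean) / sd) ** 3 for v in values])


rng = random.Random(42)
raw = [rng.lognormvariate(0.0, 1.0) for _ in range(2000)]  # right skewed
logged = [math.log(v) for v in raw]                        # log transformed

skew_raw = skewness(raw)
skew_logged = skewness(logged)
print(f"Skewness before log transformation: {skew_raw:.2f}")
print(f"Skewness after log transformation:  {skew_logged:.2f}")
```

The log transformation works well here because lognormal data are, by construction, normal on the log scale. Real data will rarely be this cooperative, and the transformation only applies to positive values.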
Normal histogram with n=100
Normal histogram with n=1,000
Normal histogram with n=10,000
Normal histogram with n=100,000
Normal histogram with n=1,000,000
Normal distribution (n=infinity)
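The sequence of histograms above can be mimicked numerically. The following sketch, which assumes standard normal data and bins of width 0.5 between -4 and 4 (choices made for this example only), measures the largest gap between a histogram bar and the proportion the normal distribution predicts for that bin.

```python
import random
from statistics import NormalDist


def max_bin_error(n, seed=0, lo=-4.0, hi=4.0, width=0.5):
    """Largest gap between an observed bin proportion and the normal probability."""
    rng = random.Random(seed)
    n_bins = int((hi - lo) / width)
    counts = [0] * n_bins
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        if lo <= x < hi:  # the rare values beyond +/-4 are left out of the bins
            counts[int((x - lo) / width)] += 1
    std_normal = NormalDist()
    worst = 0.0
    for k, count in enumerate(counts):
        left = lo + k * width
        expected = std_normal.cdf(left + width) - std_normal.cdf(left)
        worst = max(worst, abs(count / n - expected))
    return worst


errors = {n: max_bin_error(n) for n in (100, 1000, 10000, 100000)}
for n, err in errors.items():
    print(f"n = {n:>7,}: largest bin error = {err:.4f}")
```

The error should shrink as n grows, which is the numerical counterpart of the histogram smoothing out into the bell curve.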
Normal(2, 1)
Normal(-1, 1)
Normal(0, 2)
Normal(0, 0.5)
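The four curves above differ only in location and spread. The sketch below computes a few summary values for each one. It assumes that the second parameter is the standard deviation rather than the variance, which the slide titles do not state.

```python
from statistics import NormalDist

# Assumption: each pair is (mean, standard deviation), not (mean, variance).
curves = [(2, 1), (-1, 1), (0, 2), (0, 0.5)]

for mu, sigma in curves:
    dist = NormalDist(mu, sigma)
    peak_height = dist.pdf(mu)  # the density is highest at the mean
    lo, hi = dist.inv_cdf(0.025), dist.inv_cdf(0.975)
    print(f"Normal({mu}, {sigma}): peak at {mu}, height {peak_height:.3f}, "
          f"middle 95% from {lo:.2f} to {hi:.2f}")
```

Changing the mean slides the curve left or right without changing its shape. Increasing the standard deviation makes the curve wider and shorter, and decreasing it makes the curve narrower and taller.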
Skewed right
Right skewness is characterized by the tails of the distribution
Heavy right tail
Greater tendency to produce extreme values on the right
Light left tail
Lesser tendency to produce extreme values on the left
Right skewness is the most common type of non-normality
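One way to see the heavy right tail and light left tail in numbers is to compare how far the extreme percentiles sit from the median. The sketch below uses simulated exponential data as a stand-in for a right skewed variable. The distribution, the sample size, and the choice of the 1st and 99th percentiles are assumptions made for this example.

```python
import random
import statistics

rng = random.Random(7)
sample = [rng.expovariate(1.0) for _ in range(5000)]  # right skewed data

# statistics.quantiles with n=100 returns the 99 percentile cut points.
percentiles = statistics.quantiles(sample, n=100)
p01, p50, p99 = percentiles[0], percentiles[49], percentiles[98]

right_reach = p99 - p50  # how far the right tail extends past the median
left_reach = p50 - p01   # how far the left tail extends below the median

print(f"1st percentile:  {p01:.2f}")
print(f"Median:          {p50:.2f}")
print(f"99th percentile: {p99:.2f}")
print(f"Right reach / left reach: {right_reach / left_reach:.1f}")
```

For symmetric data the two distances would be roughly equal. A ratio well above 1 reflects the greater tendency to produce extreme values on the right.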
Normal probability plot
Compare data to evenly spaced percentiles of the normal distribution
Example with n=4
Compare smallest value with \(Z_{0.2}\)
Compare next value with \(Z_{0.4}\)
Compare next value with \(Z_{0.6}\)
Compare largest value with \(Z_{0.8}\)
No single best definition of "evenly spaced"
The 12.5th, 37.5th, 62.5th, and 87.5th percentiles are another reasonable choice for n=4
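The two sets of evenly spaced percentiles for n=4 can be computed directly. In the sketch below, the formulas i/(n+1) and (i-0.5)/n reproduce the 20, 40, 60, 80 and the 12.5, 37.5, 62.5, 87.5 percentiles respectively. The four data values and the function name `normal_scores` are hypothetical, chosen only for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()


def normal_scores(n, rule="i/(n+1)"):
    """Evenly spaced percentiles and the matching standard normal quantiles."""
    if rule == "i/(n+1)":
        probs = [i / (n + 1) for i in range(1, n + 1)]
    elif rule == "(i-0.5)/n":
        probs = [(i - 0.5) / n for i in range(1, n + 1)]
    else:
        raise ValueError(f"unknown rule: {rule}")
    return probs, [std_normal.inv_cdf(p) for p in probs]


data = sorted([3.1, 4.7, 5.2, 8.9])  # hypothetical sample with n=4

for rule in ("i/(n+1)", "(i-0.5)/n"):
    probs, z_values = normal_scores(len(data), rule)
    print(f"Rule {rule}")
    for value, p, z in zip(data, probs, z_values):
        print(f"  value {value:4.1f}  percentile {100 * p:5.1f}  Z = {z:+.3f}")
```

Plotting the sorted data against either column of Z values gives a normal probability plot. A roughly straight line is consistent with normality, and the choice between the two rules rarely changes the overall impression.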
Histogram for right skewed data
Histogram for left skewed data
Histogram for heavy tailed data
Histogram for light tailed data
Histogram for bimodal data
Histogram for normal data
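The six shapes listed above can be reproduced with simulated data and simple text histograms. The sketch below picks one stand-in distribution for each shape (exponential, negated exponential, Laplace, uniform, a mixture of two normals, and standard normal). These choices, along with the sample size and the number of bins, are assumptions made for illustration and are not the data behind the original figures.

```python
import random
import statistics


def simulate(shape, n, rng):
    """Draw n values from a simple stand-in distribution for each shape."""
    if shape == "right skewed":
        return [rng.expovariate(1.0) for _ in range(n)]
    if shape == "left skewed":
        return [-rng.expovariate(1.0) for _ in range(n)]
    if shape == "heavy tailed":
        # The difference of two exponentials follows a Laplace distribution.
        return [rng.expovariate(1.0) - rng.expovariate(1.0) for _ in range(n)]
    if shape == "light tailed":
        return [rng.uniform(-1.0, 1.0) for _ in range(n)]
    if shape == "bimodal":
        return [rng.gauss(rng.choice((-2.0, 2.0)), 0.7) for _ in range(n)]
    if shape == "normal":
        return [rng.gauss(0.0, 1.0) for _ in range(n)]
    raise ValueError(f"unknown shape: {shape}")


def bin_counts(data, n_bins=10):
    """Count values in n_bins equal-width bins spanning the range of the data."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x in data:
        counts[min(int((x - lo) / width), n_bins - 1)] += 1
    return counts


def print_histogram(title, data, n_bins=10, max_bar=40):
    """Print one row of '#' characters per bin, scaled to the tallest bin."""
    counts = bin_counts(data, n_bins)
    scale = max_bar / max(counts)
    print(title)
    for count in counts:
        print("  " + "#" * round(count * scale))


rng = random.Random(2024)
shapes = ["right skewed", "left skewed", "heavy tailed",
          "light tailed", "bimodal", "normal"]
samples = {shape: simulate(shape, 2000, rng) for shape in shapes}
for shape, data in samples.items():
    print_histogram(f"Histogram for {shape} data", data)
```

Each histogram is printed sideways, with the smallest values at the top. Rerunning with a different seed gives a sense of how much a histogram can vary from sample to sample even when the underlying shape stays the same.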